Abstract

We revisit the game in which each of several players chooses a pattern
and then a coin is flipped repeatedly until one of these patterns is
generated. In particular, we demonstrate how to compute the probability
of any one player winning this game, and find the distribution of
the game’s duration. Our presentation is an extension (and perhaps
a simplification) of the results of Blom and Thornburn [1].


1 Introduction 

Blom and Thornburn in [1] outline an experiment where a coin or a die is 
flipped or rolled until a particular pattern (such as TTH or 123) is generated. 
Different patterns can then be ‘played against’ one another by designating 
the winning pattern as the first one that is generated. In [1] however, it is 
assumed that the coin is fair and that the patterns are all of the same length.
In this paper we remove these restrictions; we also aim our presentation at a
wider audience by simplifying the key definitions and ideas.

We build our patterns out of only two possible characters (T and H) because
the extension to three or more is trivial.

2 Pattern Generation 

Consider performing a sequence of independent trials by flipping a potentially
biased coin. Each trial results in a ‘head’ (denoted H in this article)
with probability $p$, or a ‘tail’ (T) with probability $q = 1 - p$. Flips continue
until a specific pattern of $k$ consecutive outcomes (e.g. HHTHT) is generated.
In this article, we use calligraphic capital letters (such as $\mathcal{S}$) to denote
a pattern and small letters (such as $s_1 s_2 \cdots s_k$) for its individual symbols.

Definition 2.1 (First-Time Generation). Let $f_i$ denote the probability that
a specific pattern $\mathcal{S}$ of length $k$ is completed for the first time
at Trial $i$. Thus $f_i = 0$ when $i < k$, including $i = 0$. The probability
generating function (PGF) of the corresponding $f_i$ sequence is

$$F(z) = \sum_{i=0}^{\infty} f_i z^i$$

The expected (or mean) value of the number of trials to generate the
pattern is then given by $F'(1)$, the derivative of $F(z)$ evaluated at $z = 1$.

2.1 A key formula 

Assume now that trials are repeated indefinitely and generate an ever-increasing
number of occurrences of pattern $\mathcal{S}$.

Definition 2.2. Let $u_i$ denote the probability that Pattern $\mathcal{S}$ is completed at
Trial $i$, but not necessarily for the first time. Here $u_i = 0$ when $0 < i < k$, but
$u_0 = 1$ (we give the rationale below). Let $U(z)$ be the generating function of
the $u_0, u_1, u_2, \ldots$ sequence. Note that this generating function is not a PGF
since the $u_i$ probabilities do not add up to 1.

Definition 2.2 requires an important proviso: consecutive occurrences of
the pattern are not allowed to overlap. Each time Pattern $\mathcal{S}$ is generated, the
process is reset and the next completion has to start ‘from scratch’, with none
of the symbols of any one occurrence being available to build the next one.
This implies that not every run of the consecutive symbols $s_1 s_2 \cdots s_k$ (we call it
String $\mathcal{S}$, to emphasize the difference) counts as a completion of Pattern $\mathcal{S}$.

Let us now assume that exactly $n$ trials have been completed (this becomes
our sample space) and let us expand $u_n$ according to the trial of
the first occurrence of Pattern $\mathcal{S}$ (thus defining a partition of the sample
space). Using the total-probability formula, we deduce

$$u_n = f_0 u_n + f_1 u_{n-1} + \cdots + f_n u_0 \tag{1}$$

where each term of the right-hand side, say $f_i u_{n-i}$, represents the probability
of completing the first occurrence of $\mathcal{S}$ at Trial $i$, followed by generating yet
another (not necessarily second) occurrence of $\mathcal{S}$ at the end of the remaining
$n - i$ trials. The first term, with $i = 0$, is always zero; the last one accounts
for the possibility of no prior occurrence of $\mathcal{S}$, thus explaining the necessity
of choosing $u_0 = 1$. Since the terms represent all possibilities of what can
happen to complete $\mathcal{S}$ at Trial $n$, their sum must equal the left-hand side.

Notice that the RHS of (1) is equal to the coefficient of $z^n$ in the expansion
of $F(z) \cdot U(z)$, known as the convolution of the two sequences. It is also
important to realize that (1) is correct for $n \ge 1$ but not when $n = 0$.
Multiplying each side of (1) by $z^n$ and summing over $n$ from 1 to $\infty$ thus yields

$$U(z) - 1 = F(z) \cdot U(z)$$

where the ‘$-1$’ accounts for the missing $u_0$ term on the LHS. Solving for $F(z)$
yields

$$F(z) = \frac{U(z) - 1}{1 + (U(z) - 1)} = \left(1 + \frac{1}{U(z) - 1}\right)^{-1} \tag{2}$$
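Equation (1) can also be used term by term, without generating functions: since $f_0 = 0$ and $u_0 = 1$, it rearranges to $f_n = u_n - \sum_{i=1}^{n-1} f_i u_{n-i}$. The following is a minimal sketch of that recursion; the function name `first_time_probs` and the toy pattern (the single symbol H with $p = 1/3$) are assumptions made purely for illustration and do not come from the paper.

```python
from fractions import Fraction

def first_time_probs(u, n_max):
    """Recover f_0..f_n_max from u_0..u_n_max using Equation (1).

    With f_0 = 0 and u_0 = 1, Equation (1) rearranges to
    f_n = u_n - sum_{i=1}^{n-1} f_i * u_{n-i}   for n >= 1.
    """
    f = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        f[n] = u[n] - sum(f[i] * u[n - i] for i in range(1, n))
    return f

# Toy case (illustrative assumption): the length-1 pattern H with p = 1/3.
# Every trial completes it with probability p, so u_0 = 1 and u_n = p for n >= 1.
p = Fraction(1, 3)
u = [Fraction(1)] + [p] * 10
f = first_time_probs(u, 10)
```

For this toy pattern the recursion reproduces the geometric distribution $f_n = q^{n-1} p$, as one would expect for the waiting time until the first head.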


2.2 Pattern Overlaps 

For our presentation to be concise and readable, we introduce some operations
on strings. Namely, we frequently need to know if and when the last few
symbols of a pattern coincide with the first few symbols of another pattern.

Example 2.1. TTTHTTT overlaps with TTHTTTTHT only at shifts 1, 2 and 6
because (the overlapping symbols are vertically aligned)

Shift 1   TTTHTTT
                TTHTTTTHT

Shift 2   TTTHTTT
               TTHTTTTHT

Shift 6   TTTHTTT
           TTHTTTTHT

are the only three possibilities.

Similarly, reversing the order, TTHTTTTHT overlaps with TTTHTTT
only at shifts 1 and 5:

Shift 1   TTHTTTTHT
                  TTTHTTT

Shift 5   TTHTTTTHT
              TTTHTTT


Definition 2.3 ($\boxdot$). Let $\boxdot$ be a non-commutative function with two string
arguments which returns the set of shifts at which the first pattern overlaps
with the second one. Namely, when $\mathcal{S} = s_1 \cdots s_k$ and $\mathcal{W} = w_1 \cdots w_m$, then

$$\mathcal{S} \boxdot \mathcal{W} := \{\, i \mid s_{k-i+1} \cdots s_k = w_1 \cdots w_i \,\}$$

This set is empty when there are no such overlaps. Note that this definition
implies $i \le \min(k, m)$.

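Definition 2.4 ($\odot$). Let $\odot$ be a non-commutative function with two string
arguments which returns the largest shift possible for an $\mathcal{S}$ and $\mathcal{W}$ overlap:

$$\mathcal{S} \odot \mathcal{W} := \max\{\, i \in \mathcal{S} \boxdot \mathcal{W} \,\}$$

where $\mathcal{S} \odot \mathcal{W} := 0$ when $\mathcal{S} \boxdot \mathcal{W} = \emptyset$.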


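Definition 2.5 ($\sqcap$). Let $\sqcap$ be a non-commutative function with two string
arguments which returns the substring corresponding to the longest $\mathcal{S}$ and
$\mathcal{W}$ overlap:

$$\mathcal{S} \sqcap \mathcal{W} := w_1 \cdots w_{\mathcal{S} \odot \mathcal{W}}$$

($\mathcal{S} \sqcap \mathcal{W} := s_{k+1-\mathcal{S} \odot \mathcal{W}} \cdots s_k$ is an equivalent definition). Note that $\mathcal{S} \sqcap \mathcal{W}$ is
a zero-length string when $\mathcal{S} \odot \mathcal{W} = 0$.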

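Example 2.2. Assuming that $\mathcal{S}$ = TTTHTTT and $\mathcal{W}$ = TTHTTTTHT (the
strings of Example 2.1) we get

$$\begin{array}{ll}
\mathcal{S} \boxdot \mathcal{W} = \{1, 2, 6\} & \mathcal{W} \boxdot \mathcal{S} = \{1, 5\} \\
\mathcal{S} \odot \mathcal{W} = 6 & \mathcal{W} \odot \mathcal{S} = 5 \\
\mathcal{S} \sqcap \mathcal{W} = \text{TTHTTT} & \mathcal{W} \sqcap \mathcal{S} = \text{TTTHT} \\
\mathcal{S} \boxdot \mathcal{S} = \{1, 2, 3, 7\} & \mathcal{W} \boxdot \mathcal{W} = \{1, 4, 9\}
\end{array}$$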

2.3 Finding U(z) and F(z)

Using the same sample space of $n$ trials, consider the probability of obtaining
the symbols $s_1 \cdots s_k$ at the last $k$ of these $n$ trials. This does not necessarily
mean that Pattern $\mathcal{S}$ has been completed at Trial $n$, since such an occurrence
of String $\mathcal{S}$ (as we call it, to differentiate from Pattern $\mathcal{S}$) may overlap with
an earlier completion of the actual pattern and therefore not count as another
completion of Pattern $\mathcal{S}$.

Denote the probability of generating String $\mathcal{S}$ in (any) $k$ consecutive trials
by $P_{\mathcal{S}}$. It is clearly given by

$$P_{\mathcal{S}} = p^m (1 - p)^{k - m} \tag{3}$$

where $m$ and $k - m$ are (resp.) the number of H and T symbols in $\mathcal{S}$.

Now, we expand the probability of String $\mathcal{S}$ occurring at Trial $n$ (which
is equal to $P_{\mathcal{S}}$ for any $n \ge k$) according to the trial at which Pattern $\mathcal{S}$ has
been completed during the last $k$ trials, thus:

$$P_{\mathcal{S}} = \sum_{\ell \in \mathcal{S} \boxdot \mathcal{S}} u_{n-k+\ell} \cdot P_{s_{\ell+1} \cdots s_k} \tag{4}$$

where $s_{k+1} \cdots s_k = \varepsilon$ (the empty string) and $P_{\varepsilon} = 1$. In this context, one
must understand that when the last $k$ trials contain the symbols $s_1 \cdots s_k$,
Pattern $\mathcal{S}$ must have been completed at exactly one of these trials, and that
only Trials $n - k + \ell$ with $\ell \in \mathcal{S} \boxdot \mathcal{S}$ are eligible.

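Multiplying each side of (4) by $z^{n-k}$ and summing over $n$ from $k$ to $\infty$ yields

$$\begin{aligned}
P_{\mathcal{S}} \cdot \frac{1}{1 - z}
 &= \sum_{\ell \in \mathcal{S} \boxdot \mathcal{S}} \frac{P_{s_{\ell+1} \cdots s_k}}{z^{\ell}} \sum_{n=k}^{\infty} u_{n-k+\ell}\, z^{n-k+\ell} \\
 &= \sum_{\ell \in \mathcal{S} \boxdot \mathcal{S}} \frac{P_{s_{\ell+1} \cdots s_k}}{z^{\ell}} \sum_{m=0}^{\infty} u_{m+\ell}\, z^{m+\ell} \\
 &= (U(z) - 1) \cdot \sum_{\ell \in \mathcal{S} \boxdot \mathcal{S}} \frac{P_{s_{\ell+1} \cdots s_k}}{z^{\ell}}
\end{aligned}$$

since $1 \le \ell \le k$ and $u_1 = \cdots = u_{k-1} = 0$. Thus, in the $\sum_{m=0}^{\infty} u_{m+\ell}\, z^{m+\ell}$
summation, we are always missing $u_0$, but are including the rest of the $U(z)$
expansion.

Based on (2), and noting that $P_{\mathcal{S}} / P_{s_{\ell+1} \cdots s_k} = P_{s_1 \cdots s_\ell}$, we get

$$F(z) = \left(1 + (1 - z) \cdot \sum_{\ell \in \mathcal{S} \boxdot \mathcal{S}} \frac{1}{P_{s_1 \cdots s_\ell}\, z^{\ell}}\right)^{-1} \tag{5}$$

and can thereby find the PGF of the number of trials to generate, for the
first time, any given pattern.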

Example 2.3. Let $\mathcal{S}$ = TTHTTTTHT (the second string of the previous example).
We get

$$F(z) = \left(1 + (1 - z) \cdot \frac{p^2 q^6 z^8 + p q^4 z^5 + 1}{p^2 q^7 z^9}\right)^{-1}$$

By expanding this PGF in powers of $z$, we can extract the individual $f_i$
probabilities (as coefficients of $z^i$). For instance,
$f_{18} = p^2 q^7 (1 - p q^4 - p^2 q^6 - p^2 q^7)$
is the probability that $\mathcal{S}$ is generated, for the first time, in exactly 18
trials.
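The expansion of (5) in powers of $z$ can be carried out mechanically. Multiplying the numerator and denominator of (5) by $P_{\mathcal{S}} z^k$ gives $F(z) = P_{\mathcal{S}} z^k / D(z)$, where $D(z)$ is a polynomial with $D(0) = 1$, so the coefficients follow from a standard power-series inversion. The sketch below assumes exact rational arithmetic and patterns written as strings of H and T; the names `string_prob` and `first_time_series` are illustrative and not taken from the paper.

```python
from fractions import Fraction

def string_prob(s, p):
    """Equation (3): probability of flipping string s in len(s) consecutive trials."""
    prob = Fraction(1)
    for c in s:
        prob *= p if c == "H" else 1 - p
    return prob

def first_time_series(s, p, n_max):
    """Coefficients f_0..f_n_max of the PGF F(z) given by Equation (5)."""
    k = len(s)
    ps = string_prob(s, p)
    # A(z) = sum over self-overlap shifts l of (P_S / P_{s_1..s_l}) * z^(k - l)
    a = [Fraction(0)] * (k + 1)
    for l in range(1, k + 1):
        if s[-l:] == s[:l]:
            a[k - l] += ps / string_prob(s[:l], p)
    # D(z) = (1 - z) * A(z) + P_S * z^k, so that F(z) = P_S * z^k / D(z)
    d = [a[0]] + [a[j] - a[j - 1] for j in range(1, k + 1)]
    d[k] += ps
    # Power-series coefficients of 1 / D(z); d[0] equals 1
    inv = [Fraction(1)]
    for n in range(1, max(n_max - k, 0) + 1):
        inv.append(-sum(d[j] * inv[n - j] for j in range(1, min(n, k) + 1)))
    return [ps * inv[n - k] if n >= k else Fraction(0) for n in range(n_max + 1)]

p = Fraction(1, 2)
f = first_time_series("TTHTTTTHT", p, 18)
print(f[9], f[18])
```

With a fair coin the two printed values correspond to $f_9 = p^2 q^7$ and to the $f_{18}$ expression of Example 2.3.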


3 Playing m Patterns 

Let us now assume that we have a collection of $m$ patterns $\mathcal{S}_1, \ldots, \mathcal{S}_m$,
none of them being a substring of any other. The question is: What is
the probability that $\mathcal{S}_i$ (from this set) is completed before any of the other
patterns, thereby ‘beating’ them and ‘winning’ this game?

Definition 3.1. Let $x_{i,n}$ be the probability of a first-time occurrence of $\mathcal{S}_i$
(or ‘Pattern $i$’ for short) at Trial $n$, without being preceded by any of the
competing patterns. In other words, this is the probability of $\mathcal{S}_i$ beating all
other patterns in exactly $n$ trials. For consistency we set $x_{i,0} = 0$.

3.1 Head-start probabilities 

Recall that the string denoted $\mathcal{S}_i$ consists of the same symbols as the pattern
now denoted $i$. Completion of String $\mathcal{S}_i$ does not necessarily imply completion
of Pattern $i$.

Definition 3.2. Let $f_n^{(i)}$ denote the probability of a first-time completion of
Pattern $i$ at Trial $n$ (what was $f_n$ of Section 2). Furthermore, let $f_n^{(i|j)}$ be the
same probability, this time assuming that Pattern $j$ has just been completed
(prior to starting our trials) and that we are allowed to use any of its symbols
to help us generate Pattern $i$.

Obviously Pattern $i$ cannot utilize more than $\mathcal{S}_j \odot \mathcal{S}_i$ symbols of Pattern
$j$; having $\mathcal{S}_j \sqcap \mathcal{S}_i$ as our ‘head start’ thus results in the same probabilities.
Let $F_{i|j}(z)$ denote their PGF.

To generate $\mathcal{S}_i$ from scratch, one must first generate $\mathcal{S}_j \sqcap \mathcal{S}_i$ and then,
independently (which implies convolution of the corresponding PGFs), the
rest of the pattern. Denoting the PGF of $\mathcal{S}_j \sqcap \mathcal{S}_i$ by $F_{j \sqcap i}(z)$ we get

$$F_i(z) = F_{j \sqcap i}(z) \cdot F_{i|j}(z)$$

easily solved for $F_{i|j}(z)$. Note that $F_i(z)$ and $F_{j \sqcap i}(z)$ can be found by utilizing
Equation (5).

3.2 Solving for $x_{i,n}$

Assuming again that a fixed number of $n$ independent trials have been performed,
we can expand $f_n^{(i)}$ according to the trial (say $\ell$) in which Pattern
$j$ has won the game. Notice that going over all values of $\ell$ and $j$ creates an
obvious partition of our sample space. So,

$$f_n^{(i)} = \sum_{j \ne i} \sum_{\ell=1}^{n} x_{j,\ell} \cdot f_{n-\ell}^{(i|j)} + x_{i,n} \tag{6}$$

where the last term stands alone because Pattern $i$ can only win at Trial $n$.

Multiplying (6) by $z^n$ and summing over $n$ from 0 to $\infty$ converts this
infinite set of equations into a single statement involving the corresponding
generating functions. Namely:

$$F_i(z) = \sum_{j \ne i} X_j(z) \cdot F_{i|j}(z) + X_i(z) = \sum_{j=1}^{m} X_j(z) \cdot F_{i|j}(z) \tag{7}$$

by setting $F_{i|i}(z) = 1$.

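Dividing (7) by $F_i(z)$, and recalling that $F_{i|j}(z) / F_i(z) = 1 / F_{j \sqcap i}(z)$, converts it into

$$\sum_{j=1}^{m} \frac{X_j(z)}{F_{j \sqcap i}(z)} = 1 \tag{8}$$

which has a simple solution, given by

$$\mathbf{X} = \mathbb{F}^{-1} \mathbf{1} \tag{9}$$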

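where

i. $\mathbf{X}$ is a vector of the $X_1(z), \ldots, X_m(z)$ generating functions,

ii. $\mathbf{1}$ is a length-$m$ vector with each component equal to 1, and

iii. $\mathbb{F}$ is the following matrix

$$\mathbb{F} = \begin{pmatrix}
\dfrac{1}{F_{1 \sqcap 1}(z)} & \dfrac{1}{F_{2 \sqcap 1}(z)} & \cdots & \dfrac{1}{F_{m \sqcap 1}(z)} \\
\dfrac{1}{F_{1 \sqcap 2}(z)} & \dfrac{1}{F_{2 \sqcap 2}(z)} & \cdots & \dfrac{1}{F_{m \sqcap 2}(z)} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{1}{F_{1 \sqcap m}(z)} & \dfrac{1}{F_{2 \sqcap m}(z)} & \cdots & \dfrac{1}{F_{m \sqcap m}(z)}
\end{pmatrix}$$

That is, $\mathbb{F}_{i,j} = 1 / F_{j \sqcap i}(z)$, with $i$ specifying the row index and $j$ the column
index (note the index reversal).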


To find the probability of Pattern $i$ winning the game in any number of
trials, we need to evaluate $X_i(z)$ at $z = 1$. Since the substitution leads to an
indeterminate expression, we have to replace it by the corresponding $z \to 1$ limit
(an easy task for a modern computer).


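Example 3.1. For the patterns $\mathcal{S}$ = TTTHTTT and $\mathcal{W}$ = TTHTTTTHT of
Example 2.2 we have the following $\mathbb{F}$:

$$\mathbb{F} = \begin{pmatrix}
1 + (1 - z) \cdot \dfrac{p q^5 z^6 + p q^4 z^5 + p q^3 z^4 + 1}{p q^6 z^7} &
1 + (1 - z) \cdot \dfrac{p q^3 z^4 + 1}{p q^4 z^5} \\
1 + (1 - z) \cdot \dfrac{p q^4 z^5 + p q^3 z^4 + 1}{p q^5 z^6} &
1 + (1 - z) \cdot \dfrac{p^2 q^6 z^8 + p q^4 z^5 + 1}{p^2 q^7 z^9}
\end{pmatrix}$$

resulting, with the help of (9), in

$$\lim_{z \to 1} \mathbf{X} = \begin{pmatrix}
\dfrac{1 - p^2 q^3}{1 + p^2 q - p^3 q^3} \\
\dfrac{p^2 q (1 + q^3)}{1 + p^2 q - p^3 q^3}
\end{pmatrix} \tag{10}$$

When $p = \frac{1}{2}$, this limit evaluates to 87.32% and 12.68% respectively,
corresponding to the probability of $\mathcal{S}$ and $\mathcal{W}$ winning the game using a fair
coin.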
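The $z \to 1$ limit in this example can also be approached numerically. The following sketch builds the $2 \times 2$ matrix $\mathbb{F}$ of (9) from Equation (5), evaluates it at a value of $z$ slightly below 1 using exact rational arithmetic, and solves the system by Cramer's rule. It is restricted to two patterns for brevity; the helper names and the choice $z = 1 - 10^{-9}$ are illustrative assumptions rather than anything prescribed by the paper.

```python
from fractions import Fraction

def string_prob(s, p):
    """Equation (3): probability of flipping string s in len(s) consecutive trials."""
    prob = Fraction(1)
    for c in s:
        prob *= p if c == "H" else 1 - p
    return prob

def inv_pgf(s, p, z):
    """1 / F(z) for string s, following Equation (5); equals 1 for the empty string."""
    total = sum(1 / (string_prob(s[:l], p) * z**l)
                for l in range(1, len(s) + 1) if s[-l:] == s[:l])
    return 1 + (1 - z) * total

def overlap_string(s, w):
    """Longest suffix of s that is also a prefix of w (Definition 2.5)."""
    best = 0
    for i in range(1, min(len(s), len(w)) + 1):
        if s[-i:] == w[:i]:
            best = i
    return w[:best]

def win_pgfs(s1, s2, p, z):
    """Solve Equation (8) for X_1(z), X_2(z) in the two-pattern case."""
    pats = (s1, s2)
    # Entry (i, j) is 1 / F_{j ⊓ i}(z): row index i, column index j
    m = [[inv_pgf(overlap_string(pats[j], pats[i]), p, z) for j in range(2)]
         for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (m[1][1] - m[0][1]) / det, (m[0][0] - m[1][0]) / det

z = 1 - Fraction(1, 10**9)
x1, x2 = win_pgfs("TTTHTTT", "TTHTTTTHT", Fraction(1, 2), z)
print(float(x1), float(x2))
```

For a fair coin the two values are close to the 87.32% and 12.68% obtained from (10); exact fractions are used so that the near-singular matrix at $z \approx 1$ causes no loss of precision.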


3.3 The game’s duration 

The PGF of the number of trials to complete the game is clearly given by

$$\sum_{j=1}^{m} X_j(z) \tag{11}$$

which can be readily expanded in $z$ to yield the corresponding probability
of the game ending in exactly $i$ trials (the coefficient of $z^i$). To get the
game’s expected duration, we need to differentiate (11) with respect to $z$ and
evaluate the result at $z = 1$ (again, this will require taking the $z \to 1$ limit).
Similarly, we can find the variance, skewness, and so on, of the corresponding
distribution.

Example 3.2. The game of Example 3.1 has expected duration given by

$$\lim_{z \to 1} \frac{d\,(X_1(z) + X_2(z))}{dz}
 = \frac{1 + q^3 (1 - q^3) + p^3 q^6 + p^2 (1 + p) q^9}{p q^6 (1 + p^2 q - p^3 q^3)} \tag{12}$$

This yields the average number of 128.3 trials when $p = \frac{1}{2}$. Similarly, we get
±122.0 trials for the standard deviation.
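These figures can be cross-checked without any generating functions by following the game trial by trial. The sketch below is an independent numerical check, not the method of the paper: it tracks the probability of every possible recent history of flips, removes the probability mass of games that have just ended, and stops once the surviving mass drops below a tolerance. The function name `play_exact`, the floating-point arithmetic and the tolerance are all assumptions made for illustration.

```python
def play_exact(patterns, p, tol=1e-13):
    """Propagate the distribution of recent flips until the game is almost surely over.

    Returns (win probability of each pattern, expected number of trials).
    """
    q = 1.0 - p
    keep = max(len(s) for s in patterns) - 1  # number of past symbols worth remembering
    alive = {"": 1.0}                         # recent history -> probability, game still on
    wins = [0.0] * len(patterns)
    mean = 0.0
    n = 0
    while sum(alive.values()) > tol:
        n += 1
        nxt = {}
        for hist, pr in alive.items():
            for c, pc in (("H", p), ("T", q)):
                h = hist + c
                for i, s in enumerate(patterns):
                    if h.endswith(s):         # Pattern i wins at Trial n
                        wins[i] += pr * pc
                        mean += n * pr * pc
                        break
                else:                         # nobody won: keep only the recent symbols
                    h = h[-keep:] if keep else ""
                    nxt[h] = nxt.get(h, 0.0) + pr * pc
        alive = nxt
    return wins, mean

wins, mean = play_exact(["TTTHTTT", "TTHTTTTHT"], 0.5)
print(wins, mean)
```

For the fair-coin game of Example 3.1, this returns winning probabilities and a mean duration in agreement with (10) and (12), up to the chosen tolerance.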

3.4 Alternate solution 

There is an interesting way to bypass (9) when finding the probability of a
pattern winning the game.

Imagine playing the game repeatedly, without ever stopping; let $y_{i,n}$ denote
the probability that Pattern $i$ wins a game at Trial $n$. Provided we
have played sufficiently many games to have reached the so-called equilibrium,
the probability of Pattern $i$ winning a game at Trial $n$ is the same for all $n$,
namely

$$y_i = \lim_{n \to \infty} y_{i,n}$$

The probability of a game ending at Trial $n$ is also the same for all (sufficiently
large) $n$, and is equal to $\sum_{i=1}^{m} y_i$. The expected duration of a single game
must then be the reciprocal of this value, namely

$$\frac{1}{\sum_{i=1}^{m} y_i} \tag{13}$$

Similarly, the probability of Pattern $j$ winning a game is

$$\frac{y_j}{\sum_{i=1}^{m} y_i} \tag{14}$$

This will always agree with $\lim_{z \to 1} X_j(z)$ of Equation (9).


3.4.1 Solving for $y_j$

To find these probabilities, let $k$ be the length of Pattern $j$ and consider
$k$ consecutive trials after reaching equilibrium; label these Trial 1 through
Trial $k$. Then expand the probability of generating the symbols of Pattern
$j$ (i.e. of String $\mathcal{S}_j$) in these $k$ trials, broken down according to the pattern
which won the game at Trial $\ell$ (exactly one pattern must have won during
these $k$ trials).

Remembering that the probability of generating String $\mathcal{S}_j = s_1 \cdots s_k$ is
$P_{\mathcal{S}_j}$, we get by summing over all patterns and all ‘eligible’ trials:

$$P_{\mathcal{S}_j} = \sum_{i=1}^{m} \sum_{\ell \in \mathcal{S}_i \boxdot \mathcal{S}_j} y_i \cdot P_{s_{\ell+1} \cdots s_k} \tag{15}$$

where $P_{s_{\ell+1} \cdots s_k}$ is the probability of the corresponding string being completed
during the last $k - \ell$ trials.

Doing this for each of the $m$ patterns provides $m$ linear equations for $y_1$,
$y_2, \ldots, y_m$. Solving them and substituting back into (14) and (13) yields,
respectively, the probability of Pattern $j$ winning, and the game’s expected
duration.

Example 3.3. For $\mathcal{S}$ = TTTHTTT and $\mathcal{W}$ = TTHTTTTHT, the set of equations
(15) reads

$$\begin{pmatrix}
p q^5 + p q^4 + p q^3 + 1 & p q^5 + q^2 \\
p^2 q^6 + p^2 q^5 + p q^2 & p^2 q^6 + p q^4 + 1
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} p q^6 \\ p^2 q^7 \end{pmatrix}$$

resulting in

$$y_1 = \frac{p q^6 (1 - p^2 q^3)}{1 + q^3 (1 - q^3) + p^3 q^6 + p^2 (1 + p) q^9},
\qquad
y_2 = \frac{p^3 q^7 (1 + q^3)}{1 + q^3 (1 - q^3) + p^3 q^6 + p^2 (1 + p) q^9}$$

which leads to (10) and (12) via (14) and (13).
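Because (15) is an ordinary linear system, it can be assembled and solved for any number of patterns with a few lines of exact arithmetic. The sketch below is one possible way to do so; the function names `solve` and `equilibrium` and the use of Gauss-Jordan elimination over fractions are our own illustrative choices, and the coefficient matrix is assumed to be nonsingular.

```python
from fractions import Fraction

def string_prob(s, p):
    """Equation (3): probability of flipping string s in len(s) consecutive trials."""
    prob = Fraction(1)
    for c in s:
        prob *= p if c == "H" else 1 - p
    return prob

def solve(a, b):
    """Gauss-Jordan elimination over Fractions (assumes a is nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if m[r][c] != 0)
        m[c], m[piv] = m[piv], m[c]
        m[c] = [v / m[c][c] for v in m[c]]
        for r in range(n):
            if r != c:
                m[r] = [vr - m[r][c] * vc for vr, vc in zip(m[r], m[c])]
    return [m[r][n] for r in range(n)]

def equilibrium(patterns, p):
    """Build and solve Equation (15); return (y, winning probabilities, expected duration)."""
    size = len(patterns)
    a = [[Fraction(0)] * size for _ in range(size)]
    b = []
    for j, sj in enumerate(patterns):
        b.append(string_prob(sj, p))
        for i, si in enumerate(patterns):
            for l in range(1, min(len(si), len(sj)) + 1):
                if si[-l:] == sj[:l]:                    # l belongs to S_i ⊡ S_j
                    a[j][i] += string_prob(sj[l:], p)    # remaining symbols of S_j
    y = solve(a, b)
    total = sum(y)
    return y, [v / total for v in y], 1 / total          # Equations (14) and (13)

y, win, duration = equilibrium(["TTTHTTT", "TTHTTTTHT"], Fraction(1, 2))
print(win, duration, float(duration))
```

For the fair-coin case this gives the winning probabilities and expected duration as exact fractions, which round to the 87.32%, 12.68% and 128.3 trials quoted earlier.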

The advantage of this approach is that it bypasses the use of PGFs,
allowing us to deal directly with numbers, rather than polynomials. This
difference matters greatly when inverting matrices. Its shortcoming is its
inability to compute more than the probabilities of winning the game and
the expected value of the game’s duration.

3.5 Three Characters and Beyond 

When using three or more characters (e.g. 1, 2, ..., 6 with a die), the
only formula which changes, in a rather obvious manner, is (3). For example,
having three distinct possibilities for a symbol (say H, T, and R), we would
now get

$$P_{\mathcal{S}} = p_H^{m_H}\, p_T^{m_T}\, p_R^{m_R}$$

where $m_H$, $m_T$, and $m_R$ are the numbers of symbols of each type found in $\mathcal{S}$, and
$p_H$, $p_T$, and $p_R$ are the corresponding individual probabilities. The remaining
formulas require no modification.
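In code, this generalization amounts to replacing the two-symbol version of (3) by a product over whatever symbols occur. A minimal sketch, assuming the symbol probabilities are supplied as a dictionary (the function name and the example probabilities below are hypothetical):

```python
from collections import Counter
from fractions import Fraction

def string_prob(s, probs):
    """Generalized Equation (3): product of p_c ** m_c over the symbols c found in s."""
    result = Fraction(1)
    for symbol, count in Counter(s).items():
        result *= probs[symbol] ** count
    return result

# Hypothetical three-symbol example: P(H) = 1/2, P(T) = 1/3, P(R) = 1/6
probs = {"H": Fraction(1, 2), "T": Fraction(1, 3), "R": Fraction(1, 6)}
print(string_prob("HTTR", probs))
```

Here the pattern HTTR contains one H, two T symbols and one R, so its probability is $\frac{1}{2} \cdot \left(\frac{1}{3}\right)^2 \cdot \frac{1}{6}$.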